the whole data           one country
count       %            count     %
51,383      87.24        711       84.74
 4,241       7.20         66        7.87
 1,502       2.55         17        2.03
 1,771       3.01         45        5.36
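The second figure in each pair above appears to be the share of its column total. The short check below is a sketch under that assumption; the function name `shares` and the two list names are illustrative and do not come from the text.

```python
def shares(counts):
    """Percentage of the column total for each count, rounded to two decimals."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]


# Counts copied from the two column groups of the table.
whole_data = [51383, 4241, 1502, 1771]
one_country = [711, 66, 17, 45]

print(shares(whole_data))   # [87.24, 7.2, 2.55, 3.01]
print(shares(one_country))  # [84.74, 7.87, 2.03, 5.36]
```

Both results agree with the tabulated percentages, which supports reading the columns as a count followed by its share.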

Discrimination between countries based on genomics pattern

Discriminant analysis models were constructed to examine how the genomics pattern deviated between the four countries. The pair-wise discrimination power between countries was examined, such as the discrimination power between USA and India, etc. Each discrimination model was constructed using the Lasso regression algorithm, which is a linear model. Figure 7.19(a) shows the ROC curves from these models. It can be seen that all models demonstrated almost perfect discrimination power, indicating that the genomics pattern of sequences from the four countries may differ significantly.
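The setup of this pair-wise analysis can be sketched as follows. This is a minimal illustration, not the code used for the figure: it builds the 64-word 3-mer library, turns one sequence into a frequency vector, and enumerates the six country pairs. Only USA and India are named in the text, so the other two labels are placeholders, and the function name `kmer_frequencies` and the toy sequence are assumptions. Fitting the Lasso model for each pair is left out.

```python
from itertools import combinations, product

K = 3
# The 64-word library: every 3-letter word over the DNA alphabet.
WORDS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_frequencies(sequence):
    """Return the relative frequency of each library word in one sequence."""
    counts = dict.fromkeys(WORDS, 0)
    seq = sequence.upper()
    for i in range(len(seq) - K + 1):
        word = seq[i:i + K]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in WORDS]


# Only USA and India appear in the text; the last two labels are placeholders.
COUNTRIES = ["USA", "India", "Country C", "Country D"]
PAIRS = list(combinations(COUNTRIES, 2))  # one binary model per pair

features = kmer_frequencies("ATGCGATNACG")  # toy sequence, not real data
```

Four countries give six unordered pairs, which matches the six models in the investigation. Each sequence becomes a 64-dimensional vector that a pair-wise classifier could take as input.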

Figure 7.19 (a) The ROC curves of discrimination models constructed for discriminating the genomics pattern from one country against the other country based on the 3-mer word library. (b) The heatmap of rankings of 64 words in six Lasso discrimination models.
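One way a rank table like the one behind panel (b) could be assembled is sketched below. The text does not state how the rankings were derived, so ranking words by the absolute value of their Lasso coefficient is an assumption here. The coefficients are invented toy values for four words and two models, and `rank_words` is a hypothetical helper.

```python
def rank_words(coefficients):
    """Map each word to its rank (1 = largest absolute coefficient)."""
    ordered = sorted(coefficients, key=lambda w: abs(coefficients[w]), reverse=True)
    return {word: rank for rank, word in enumerate(ordered, start=1)}


# Hypothetical coefficients for two of the six models, four words only.
models = {
    ("USA", "India"): {"AAA": 0.0, "ACG": -1.2, "CGT": 0.4, "TTT": 0.9},
    ("USA", "Country C"): {"AAA": 0.7, "ACG": 0.1, "CGT": -2.0, "TTT": 0.0},
}

# One row per word, one column per model: the layout a heatmap would display.
rank_table = {
    word: [rank_words(coefs)[word] for coefs in models.values()]
    for word in ["AAA", "ACG", "CGT", "TTT"]
}
```

With the full library the table would have 64 rows and six columns, one column per pair-wise model.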

Figure 7.19(b) shows the rankings of 64 3-mers (words) when discriminating sequences from one country against sequences from another country. There were six models in total for this investigation. The heatmap shows that different words had different significant